Abstract: We propose a general statistical inference
framework to capture the privacy threat incurred by a 
user that releases data to a passive but curious adversary, 
given utility constraints. We show that applying this general 
framework to the setting where the adversary uses the 
self-information cost function naturally leads to a non- 
asymptotic information-theoretic approach for characteriz- 
ing the best achievable privacy subject to utility constraints. 
Based on these results we introduce two privacy metrics, 
namely average information leakage and maximum infor- 
mation leakage. We prove that under both metrics the 
resulting design problem of finding the optimal mapping 
from the user's data to a privacy-preserving output can 
be cast as a modified rate-distortion problem which, in 
turn, can be formulated as a convex program. Finally, we 
compare our framework with differential privacy. 

I. Introduction 

A. Motivation 

Increasing volumes of user data are being collected 
over wired and wireless networks, by a large number of 
companies who mine this data to provide personalized 
services or targeted advertising to users. As a conse- 
quence, privacy is gaining ground as a major topic in 
the social, legal, and business realms. This trend has 
spurred recent research in the area of theoretical models 
for privacy, and their application to the design of privacy- 
preserving services. Most privacy-preserving techniques, 
such as anonymization, k-anonymity [1] and differential
privacy [2], are based on some form of perturbation of
the data, either before or after the data is used in some 
computation. These perturbation techniques provide pri- 
vacy guarantees at the expense of a loss of accuracy in 
the computation result, which leads to a privacy-accuracy 
trade-off. 

In this paper, we consider the general setting where 
a user wishes to release a set of measurements to an 
analyst who provides a service (e.g. a recommendation 
system), while keeping data that are correlated with these 
measurements private. On one hand, the analyst is a 
legitimate receiver for these measurements, from which 
he expects to derive some utility. On the other hand, the 
correlation of these measurements with the user's private
data gives the analyst the ability to illegitimately infer
private information. The tension between the privacy 
requirements of the user and the utility expectations of 
the analyst gives rise to the problems of privacy-utility 
trade-off modeling, and the design of release schemes 
minimizing the privacy risks incurred by the user, while 
satisfying the utility constraints of the analyst. 

B. Contributions 

Our contributions are three-fold. First, we propose a 
general statistical inference framework to capture the pri- 
vacy threat incurred by a user who releases information 
given certain utility constraints. The privacy risk is mod- 
eled as an inference cost gain by a passive but curious 
adversary upon observing the information released by 
the user. In broad terms, this cost gain represents the
"amount of knowledge" learned by an adversary about 
the private data after observing the user's output. The 
design problem of finding the optimal mapping from 
the user's information to a privacy-preserving output 
is formulated as an optimization problem where the 
cost gain of the adversary is minimized for a given set 
of utility constraints. This formulation is general and 
given in terms of minimizing both the average and the 
maximum cost gain, being applicable to different cost 
functions. 

Second, we apply this general framework to the 
case when the adversary uses the self-information cost 
function. We show how this naturally leads to a non- 
asymptotic information-theoretic framework to charac- 
terize the information leakage subject to utility con- 
straints. Based on these results we introduce two privacy 
metrics, namely average information leakage and maximum
information leakage. We also demonstrate that the
problem of designing a privacy-preserving mechanism
that achieves the optimal privacy-accuracy tradeoff, both
for the average and the maximum information leakage, can
be cast as a modified rate-distortion problem. We then
prove that these problems, in turn, can be expressed as
convex programs. As a consequence, the privacy-preserving
mapping that achieves the optimal privacy-utility
tradeoff can be efficiently found using convex minimization
algorithms or widely available convex solvers.

Finally, we compare the average information leakage 
and maximum information leakage metrics with differen- 
tial privacy. We show that differential privacy does not 
provide in general any privacy guarantees in terms of 
average or maximum information leakage. Furthermore, 
we introduce the definition of information privacy, and 
prove that information privacy implies both differential 
privacy and privacy in terms of (average or maximum) 
information leakage. 

C. Related Work 

In the privacy research community, a prevalent and 
strong notion of privacy is that of differential privacy [2],
[3]. Differential privacy bounds the variation of the distribution
of the released output given the input database,
when the input database varies slightly, e.g. by a single 
entry. Intuitively, released outputs satisfying differential
privacy make "neighboring" databases difficult to
distinguish. However, differential privacy provides neither
guarantees nor intuition on the amount of information leaked when a
differentially private release occurs. Moreover, user data
usually presents correlations. Differential privacy does 
not factor in correlations in user data, as the distribution 
of user data is not taken into account in this model. A 
natural question is how the notion of privacy proposed 
in this paper compares to that of differential privacy. We 
cover this question in more detail in Section V.

Several approaches rely on information-theoretic tools 
to model privacy-accuracy trade-offs, such as [4]-[7].
Indeed, information theory, and more specifically rate- 
distortion theory, appear as natural frameworks to an- 
alyze the privacy-accuracy trade-off resulting from the 
distortion of correlated data. Although the approach we 
introduce in this paper involves information theoretic 
metrics, it is fundamentally different from previous in- 
formation theoretic privacy models. Indeed, traditional 
information theoretic privacy models, such as |5l, ||3, 
focus on collective privacy for all or subsets of the entries 
of a database, and provide asymptotic guarantees on the 
average remaining uncertainty per database entry - or 
equivocation per input variable - after the output release. 
More precisely, the average equivocation per entry is 
modeled as the conditional entropy of the input variables 
given the released output, normalized by the number 
of input variables. In contrast, the general framework 
introduced in this paper provides privacy guarantees in 
terms of bounds on the inference cost gain that an 
adversary achieves by observing the released output. The 
use of a self-information cost yields a non-asymptotic
information theoretic framework modeling the privacy
risk in terms of information leakage. This framework, in 
turn, can be used to design practical privacy preserving 
mappings. Finally, we would like to point out that the 
formulation in [4] differs from previously mentioned
information theoretic models, and addresses a particular 
case of the general framework introduced in this paper. 

The paper is organized as follows. We describe the
set-up and the threat model in Section II and formulate
the privacy-accuracy trade-off in Section III. Our main
results and their proofs are presented in Section IV.
Finally, in Section V we draw a comparison between the
privacy notion proposed in this paper and other existing
privacy models, leading to the concluding remarks in
Section VI.

II. General Setup and threat model 

In this section we outline the general setup considered 
in this paper and the corresponding threat model. 

A. General setup 

We assume that there are two parties that communicate 
over a noiseless channel, namely Alice and Bob. Alice 
has access to a set of measurement points, represented
by the variable $Y \in \mathcal{Y}$, that she wishes to transmit
to Bob. At the same time, Alice requires that a set
of variables $S \in \mathcal{S}$ should remain private, where $S$ is
jointly distributed with $Y$ according to the distribution
$(Y,S) \sim p_{Y,S}(y,s)$, $(y,s) \in \mathcal{Y} \times \mathcal{S}$. Depending on the
considered setting, the variable S can be either directly 
accessible to Alice or inferred from Y. If no privacy 
mechanism was in place, Alice would simply transmit 
Y to Bob. 

Bob has a utility requirement for the information sent 
by Alice. Furthermore, Bob is honest but curious, and 
will try to learn S from Alice's transmission. Alice's 
goal is to find and transmit a distorted version of Y, 
denoted by $U \in \mathcal{U}$, such that $U$ satisfies a target utility
constraint for Bob, but "protects" (in a sense made more 
precise later) the private variable S. We assume that Bob 
is passive but computationally unbounded, and will try 
to infer S based on U. 

We consider, without loss of generality, that $S \to Y \to U$.
Note that this model can capture the case
where $S$ is directly accessible by Alice by appropriately
adjusting the alphabet $\mathcal{Y}$. For example, this can be done
by representing $S \to Y$ as an injective mapping or
allowing $\mathcal{S} \subseteq \mathcal{Y}$. In other words, even though the privacy
mechanism is designed as a mapping from $\mathcal{Y}$ to $\mathcal{U}$, it is
not limited to an output perturbation, and it encompasses
input perturbation settings.
Definition 1. A privacy preserving mapping is a probabilistic
mapping $g : \mathcal{Y} \to \mathcal{U}$ characterized by a transition
probability $p_{U|Y}(u|y)$, $y \in \mathcal{Y}$, $u \in \mathcal{U}$.

Since the framework developed here results in formu- 
lations that are similar to the ones found in rate-distortion 
theory, we will use the term distortion to indicate a measure
of utility. Furthermore, we will use the terms utility
and accuracy interchangeably throughout the paper. 

Definition 2. Let $d : \mathcal{Y} \times \mathcal{U} \to \mathbb{R}^+$ be a given distortion
metric. We say that a privacy preserving mapping has
distortion $\Delta$ if $E_{Y,U}[d(Y,U)] \le \Delta$.

We make the following assumptions: 

1) Alice and Bob know the prior distribution
$p_{Y,S}(\cdot)$. This represents the side information that
an adversary has.

2) Bob has complete knowledge of the privacy preserving
mapping, i.e., $g$ and $p_{U|Y}(\cdot)$ are known.

Note that this represents the worst-case statistical side 
information that an adversary can have about the input. 

B. Threat model 

We assume that Bob selects a revised distribution $q \in \mathcal{P}_S$,
where $\mathcal{P}_S$ is the set of all probability distributions
over $\mathcal{S}$, in order to minimize an expected cost $C(S,q)$.
In other words, the adversary chooses $q$ as the solution
of the minimization

$$c_0^* = \min_{q \in \mathcal{P}_S} E_S[C(S,q)] \tag{1}$$

prior to observing $U$, and

$$c_u^* = \min_{q \in \mathcal{P}_S} E_{S|U}[C(S,q) \mid U = u] \tag{2}$$

after observing the output U. Note that this restriction 
on Bob models a very broad class of adversaries that 
perform statistical inference, capturing how an adversary 
acts in order to infer a revised belief distribution over the 
private variables S when observing U. After choosing 
this distribution, the adversary can perform an estimate 
of the input distribution (e.g. using a MAP estimator). 
However, the quality of the inference is inherently tied 
to the revised distribution q. 

The average cost gain by an adversary after observing
the output is

$$\Delta C = c_0^* - E_U[c_u^*]. \tag{3}$$

The maximum cost gain by an adversary is measured in
terms of the most informative output (i.e. the output that
gives the largest gain in cost), given by

$$\Delta C^* = c_0^* - \min_{u \in \mathcal{U}} c_u^*. \tag{4}$$
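As a minimal numerical sketch of (1)-(4), the following Python snippet computes both cost gains for a small joint distribution of $(S,U)$. It assumes the self-information cost introduced in Section IV, for which the optimal costs before and after observing $U$ are Shannon entropies. The function names and the table layout `p_su[s][u]` are illustrative assumptions and not part of the framework itself.

```python
import math


def entropy(dist):
    """Shannon entropy in bits of a probability vector."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


def cost_gains(p_su):
    """Return (average gain, maximum gain) in bits for a joint pmf p_su[s][u].

    Assumes the log-loss cost, so c_0* = H(S) and c_u* = H(S | U = u).
    """
    n_s, n_u = len(p_su), len(p_su[0])
    p_s = [sum(row) for row in p_su]
    p_u = [sum(p_su[s][u] for s in range(n_s)) for u in range(n_u)]

    c0 = entropy(p_s)  # best expected cost before observing U, as in (1)
    cu = {}            # best expected cost after observing U = u, as in (2)
    for u in range(n_u):
        if p_u[u] > 0:
            posterior = [p_su[s][u] / p_u[u] for s in range(n_s)]
            cu[u] = entropy(posterior)

    avg_gain = c0 - sum(p_u[u] * cu[u] for u in cu)  # equation (3)
    max_gain = c0 - min(cu.values())                  # equation (4)
    return avg_gain, max_gain
```

For an output that is independent of $S$ both gains are zero, while an output that reveals a uniform binary $S$ exactly yields a gain of one bit under both measures.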

In the next section we present a formulation for the 
privacy-accuracy tradeoff based on this general setting. 



III. A General Formulation for the Privacy-Accuracy Tradeoff

A. The privacy-accuracy tradeoff as an optimization 
problem 

Our goal is to design privacy preserving mappings that
minimize $\Delta C$ or $\Delta C^*$ for a given distortion level $\Delta$,
characterizing the fundamental privacy-utility tradeoff.
More precisely, our focus is to solve optimization problems
over $p_{U|Y} \in \mathcal{P}_{U|Y}$ of the form

$$\min \ \Delta C \ \text{or} \ \Delta C^* \tag{5}$$
$$\text{s.t. } E_{Y,U}[d(Y,U)] \le \Delta, \tag{6}$$

where $\mathcal{P}_{U|Y}$ is the set of all conditional probability
distributions of $U$ given $Y$.

Remark 1. In the remainder of the paper we consider
only one distortion constraint. However, it is straightforward
to generalize the formulation and the subsequent
optimization problems to multiple distinct distortion constraints
$E_{Y,U}[d_1(Y,U)] \le \Delta_1, \dots, E_{Y,U}[d_n(Y,U)] \le \Delta_n$.
This can be done by simply adding one additional
linear constraint per distortion function to the convex program.

B. Application examples 

We illustrate next how the proposed model can be 
cast in terms of privacy preserving queries and hiding 
features within data sets. 

1) Privacy-preserving queries to a database: The
framework described above can be applied to database
privacy problems, such as those considered in differential
privacy. In this case we denote the private variable as a
vector $S = (S_1, \dots, S_n)$, where $S_j \in \mathcal{S}$, $1 \le j \le n$,
and $S_1, \dots, S_n$ are discrete entries of a database that
represent, for example, the entries of $n$ users. A (not
necessarily deterministic) function $f : \mathcal{S}^n \to \mathcal{Y}$ is
calculated over the database with output $Y$ such that
$Y = f(S_1, \dots, S_n)$. The goal of the privacy preserving
mapping is to present a query output $U$ such that the
individual entries $S_1, \dots, S_n$ are "hidden", i.e. the estimation
cost gain of an adversary is minimized according
to the previous discussion, while still preserving the
utility of the query in terms of the target distortion
constraint. We illustrate this case with the counting
query, which will be a recurring example throughout the
rest of this paper.

Example 1 (Counting query). Let $S_1, \dots, S_n$ be entries
in a database, and define:

$$Y = f(S_1, \dots, S_n) = \sum_{i=1}^{n} 1_A(S_i), \tag{7}$$

where

$$1_A(x) = \begin{cases} 1 & \text{if } x \text{ has property } A, \\ 0 & \text{otherwise.} \end{cases}$$

In this case there are two possible approaches: (i) output
perturbation, where $Y$ is distorted directly to produce $U$,
and (ii) input perturbation, where each individual entry
$S_i$ is distorted directly, resulting in a new query output
$U$.
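The two approaches of Example 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the predicate `has_property`, the noise source `noise`, and the choice of independent bit flips with probability `flip_prob` as the input distortion are hypothetical and are not prescribed by the framework.

```python
import random


def counting_query(entries, has_property):
    """Y = sum_i 1_A(S_i): number of entries that have property A."""
    return sum(1 for s in entries if has_property(s))


def output_perturbation(entries, has_property, noise):
    """Distort the query output Y directly: U = Y + noise()."""
    return counting_query(entries, has_property) + noise()


def input_perturbation(entries, has_property, flip_prob, rng):
    """Distort each indicator 1_A(S_i) independently, then count.

    Each indicator bit is flipped with probability flip_prob (an assumed
    distortion, chosen only for illustration).
    """
    bits = [1 if has_property(s) else 0 for s in entries]
    noisy = [b ^ 1 if rng.random() < flip_prob else b for b in bits]
    return sum(noisy)
```

For instance, `output_perturbation(ages, lambda a: a >= 30, lambda: random.gauss(0, 1))` would release a noisy count of the entries of a hypothetical list `ages` that are at least 30.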

2) Hiding dataset features: Another important partic- 
ularization of the proposed framework is the obfuscation 
of a set of features S by distorting the entries of a data 
set $Y$. In this case $|\mathcal{S}| \le |\mathcal{Y}|$, and $S$ represents a set
of features that might be inferred from the data $Y$, such
as age group or salary. The distortion can be defined
according to the utility of a given statistical learning
algorithm (e.g. a recommendation system) used by Bob.

IV. Privacy-Accuracy Tradeoff Results

The formulation introduced in the previous section is 
general and can be applied to different cost functions. 
In this section we particularize the formulation to the 
case where the adversary uses the self-information cost 
function, as discussed below. 

A. The self-information cost function

The self-information (or log-loss) cost function is
given by

$$C(S,q) = -\log q(S). \tag{8}$$

There are several motivations for using such a cost 
function. For an overview of the central role of the 
self-information cost function in prediction, we refer the 
reader to [8]. Briefly, the self-information cost function
is the only local, proper and smooth cost function for
an alphabet of size at least three. Furthermore, since the
minimum self-information loss probability assignments
are essentially ML estimates, this cost function is consistent
with a "rational" adversary. In addition, the average
cost gain when using the self-information cost can be
related to the cost gain when using any other bounded
cost function [8]. Finally, as we will see below, this
minimization implies a "closeness" constraint between
the prior and a posteriori probability distributions in
terms of KL-divergence. In Section V we compare the
resulting privacy measure with that of differential privacy
and information privacy.
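To make the role of the log-loss concrete, the short Python sketch below evaluates the expected self-information cost $E_S[-\log q(S)]$ for a true distribution `p` and a belief `q`. It illustrates the standard fact (Gibbs' inequality) that the cost is minimized by `q = p`, where it equals the entropy, and that the excess cost is the KL-divergence. The function names are illustrative, and base-2 logarithms are an assumed convention.

```python
import math


def expected_log_loss(p, q):
    """E_{S~p}[-log2 q(S)] in bits; infinite if q excludes a symbol p can emit."""
    total = 0.0
    for ps, qs in zip(p, q):
        if ps > 0:
            if qs == 0:
                return math.inf
            total -= ps * math.log2(qs)
    return total


def entropy(p):
    """H(p): the expected log-loss of the best belief, q = p."""
    return expected_log_loss(p, p)


def kl_divergence(p, q):
    """D(p || q): extra expected log-loss paid for believing q instead of p."""
    return expected_log_loss(p, q) - entropy(p)
```

With `p = [0.5, 0.25, 0.25]`, the entropy is 1.5 bits, and any other belief, such as the uniform one, incurs a strictly larger expected cost.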

In the next sections we show how the cost minimization
problems in (5), used with the self-information cost
function, can be cast as convex programs and, therefore,
can be efficiently solved using interior point methods or
widely available convex solvers.



B. Average information leakage 

It is straightforward to show that for the log-loss function
$c_0^* = H(S)$ and, consequently, $c_u^* = H(S|U = u)$,
and, therefore,

$$\Delta C = I(S;U) = E_U[D(p_{S|U} \| p_S)], \tag{9}$$

where $D(\cdot \| \cdot)$ is the KL-divergence. The minimization
(5) can then be rewritten according to the following
definition.

Definition 3. The average information leakage of a set
of features $S$ given a privacy preserving output $U$ is
given by $I(S;U)$. A privacy-preserving mapping $p_{U|Y}(\cdot)$
is said to provide the minimum average information
leakage for a distortion constraint $\Delta$ if it is the solution
of the minimization

$$\min_{p_{U|Y}} \ I(S;U) \tag{10}$$
$$\text{s.t. } E_{Y,U}[d(Y,U)] \le \Delta. \tag{11}$$

Observe that finding the mapping $p_{U|Y}(u|y)$ that
provides the minimum information leakage is a modified
rate-distortion problem. Alternatively, we can rewrite this
optimization as

$$\min_{p_{U|Y}} \ E_U[D(p_{S|U} \| p_S)] \tag{12}$$
$$\text{s.t. } E_{Y,U}[d(Y,U)] \le \Delta. \tag{13}$$

The minimization (12) has an interesting and intuitive
interpretation. If we consider KL-divergence as a metric
for the distance between two distributions, (12) states
that the revised distribution after observing $U$ should be
as close as possible to the a priori distribution in terms
of KL-divergence.

The following theorem shows how the optimization
in the previous definition can be expressed as a
convex optimization problem. We note that this optimization
is solved in terms of the unknowns $p_{U|Y}(\cdot|\cdot)$ and
$p_{U|S}(\cdot|\cdot)$, which are coupled together through a linear
equality constraint.

Theorem 1. Given $p_{S,Y}(\cdot,\cdot)$, a distortion function $d(\cdot,\cdot)$
and a distortion constraint $\Delta$, the mapping $p_{U|Y}(\cdot|\cdot)$ that
minimizes the average information leakage can be found
by solving the following convex optimization (assuming
the usual simplex constraints on the probability distributions):

$$\min_{p_{U|Y},\, p_{U|S}} \ \sum_{u \in \mathcal{U}} \sum_{s \in \mathcal{S}} p_{U|S}(u|s)\, p_S(s) \log \frac{p_{U|S}(u|s)}{p_U(u)} \tag{14}$$
$$\text{s.t. } \sum_{u \in \mathcal{U}} \sum_{y \in \mathcal{Y}} p_{U|Y}(u|y)\, p_Y(y)\, d(u,y) \le \Delta, \tag{15}$$
$$\sum_{y \in \mathcal{Y}} p_{Y|S}(y|s)\, p_{U|Y}(u|y) = p_{U|S}(u|s) \quad \forall u, s, \tag{16}$$
$$\sum_{s \in \mathcal{S}} p_{U|S}(u|s)\, p_S(s) = p_U(u) \quad \forall u. \tag{17}$$

Proof: Clearly the previous optimization is the same
as (10). To prove the convexity of the objective function,
note that $h(x,a) = a x \log x$ is convex in $x$ for a fixed
$a \ge 0$ and $x \ge 0$, and, therefore, its perspective
$g_1(x,z,a) = a x \log(x/z)$ is also convex in $x$ and $z$ for
$z > 0$, $a \ge 0$ [9]. Since the objective function (14) can
be written as

$$\sum_{u \in \mathcal{U}} \sum_{s \in \mathcal{S}} g_1(p_{U|S}(u|s), p_U(u), p_S(s)),$$

it follows that the optimization is convex. In addition, since
$p_U(u) \to 0 \Rightarrow p_{U|S}(u|s) \to 0$ for all $u$, the minimization is well
defined over the probability simplex. ■
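As a hedged illustration of how the quantities in (14)-(17) fit together, the Python sketch below evaluates the objective and the distortion for one candidate mapping; it does not solve the convex program. The function name, the nested-list layout (`p_sy[s][y]`, `p_u_given_y[y][u]`, `d[u][y]`), the base-2 logarithm and the assumption $p_S(s) > 0$ for every $s$ are all illustrative choices.

```python
import math


def evaluate_mapping(p_sy, p_u_given_y, d):
    """Return (leakage in bits, expected distortion) of a candidate mapping.

    p_sy[s][y]        joint pmf of (S, Y), assumed to have p_S(s) > 0 for all s
    p_u_given_y[y][u] candidate privacy preserving mapping
    d[u][y]           distortion function
    """
    n_s, n_y = len(p_sy), len(p_sy[0])
    n_u = len(p_u_given_y[0])
    p_s = [sum(p_sy[s]) for s in range(n_s)]
    p_y = [sum(p_sy[s][y] for s in range(n_s)) for y in range(n_y)]

    # Coupling constraint (16): p_{U|S} is induced by p_{Y|S} and p_{U|Y}.
    p_u_given_s = [
        [sum(p_sy[s][y] / p_s[s] * p_u_given_y[y][u] for y in range(n_y))
         for u in range(n_u)]
        for s in range(n_s)
    ]
    # Marginal constraint (17).
    p_u = [sum(p_u_given_s[s][u] * p_s[s] for s in range(n_s))
           for u in range(n_u)]

    # Objective (14), i.e. I(S;U), skipping zero-probability terms.
    leakage = sum(
        p_u_given_s[s][u] * p_s[s] * math.log2(p_u_given_s[s][u] / p_u[u])
        for s in range(n_s) for u in range(n_u)
        if p_u_given_s[s][u] > 0
    )
    # Left-hand side of the distortion constraint (15).
    distortion = sum(
        p_u_given_y[y][u] * p_y[y] * d[u][y]
        for y in range(n_y) for u in range(n_u)
    )
    return leakage, distortion
```

A convex solver would search over `p_u_given_y` for the smallest leakage whose distortion stays below $\Delta$; the function above only scores a given candidate, which is useful for checking a solver's output.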

Remark 2. Note that the previous optimization can also 
be solved using a dual minimization procedure analogous 
to the Arimoto-Blahut algorithm [10] by starting at a
fixed marginal probability pu{u), solving a convex min- 
imization at each step (with an added linear constraint 
compared to the original algorithm) and updating the 
marginal distribution. However, the above formulation 
allows the use of efficient algorithms for solving convex 
problems, such as interior-point methods. In fact, the 
previous minimization can be simplified to formulate 
the traditional rate-distortion problem as a single convex 
program, not requiring the use of the Arimoto-Blahut 
algorithm. 

Remark 3. The formulation in Theorem 1 can be easily
extended to the case when $U$ is determined directly from
$S$, i.e. when Alice has access to $S$ and the privacy
preserving mapping is given by $p_{U|S}(\cdot|\cdot)$ directly. For
this, constraint (16) should be substituted by

$$\sum_{y \in \mathcal{Y}} p_{Y|S}(y|s)\, p_{U|Y,S}(u|y,s) = p_{U|S}(u|s) \quad \forall u, s, \tag{18}$$

and the following linear constraint added

$$\sum_{s \in \mathcal{S}} p_{S|Y}(s|y)\, p_{U|Y,S}(u|y,s) = p_{U|Y}(u|y) \quad \forall u, y, \tag{19}$$

with the minimization being performed over the variables
$p_{U|Y,S}(u|y,s)$, $p_{U|Y}(u|y)$ and $p_{U|S}(u|s)$, with the usual
simplex constraints on the probabilities.

We now particularize the previous result for the case
where $Y$ is a deterministic function of $S$.

Corollary 1. If $Y$ is a deterministic function of $S$ and
$S \to Y \to U$, then the minimization in (10) can be
simplified to a rate-distortion problem:

$$\min_{p_{U|Y}} \ I(Y;U) \tag{20}$$
$$\text{s.t. } E_{Y,U}[d(Y,U)] \le \Delta. \tag{21}$$

Furthermore, by restricting $U = Y + Z$ and $d(Y,U) = d(Y - U)$,
the optimization reduces to

$$\max_{p_Z} \ H(Z) \tag{22}$$
$$\text{s.t. } E_Z[d(Z)] \le \Delta. \tag{23}$$


Proof: Since $Y$ is a deterministic function of $S$ and
$S \to Y \to U$, then

$$\begin{aligned}
I(S;U) &= I(S,Y;U) - I(Y;U|S) && (24) \\
&= I(Y;U) + I(S;U|Y) - I(Y;U|S) && (25) \\
&= I(Y;U), && (26)
\end{aligned}$$

where (26) follows from the fact that $Y$ is a deterministic
function of $S$ ($I(Y;U|S) = 0$) and $S \to Y \to U$
($I(S;U|Y) = 0$). For the additive noise case, the result
follows by observing that $H(Y|U) = H(Z)$. ■
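In the setting of Corollary 1 the design problem is a standard rate-distortion problem, so the classical Blahut-Arimoto iteration mentioned in Remark 2 applies without modification. The Python sketch below is a minimal version that solves the Lagrangian form for a fixed multiplier `beta` rather than for a fixed $\Delta$; `beta`, the function name and the layout `d[y][u]` are assumptions made for illustration, and sweeping `beta` traces out the tradeoff curve.

```python
import math


def blahut_arimoto(p_y, d, beta, iters=200):
    """Classical Blahut-Arimoto iteration for min I(Y;U) + beta * E[d(Y,U)].

    p_y[y]  source pmf, d[y][u] distortion, beta > 0 Lagrange multiplier.
    Returns (mapping p_{U|Y}[y][u], rate in bits, expected distortion).
    """
    n_y, n_u = len(p_y), len(d[0])
    q = [1.0 / n_u] * n_u  # current guess for the output marginal p_U
    for _ in range(iters):
        mapping = []
        for y in range(n_y):
            # Optimal mapping for a fixed marginal: q(u) * exp(-beta * d).
            w = [q[u] * math.exp(-beta * d[y][u]) for u in range(n_u)]
            z = sum(w)
            mapping.append([x / z for x in w])
        # Optimal marginal for a fixed mapping.
        q = [sum(p_y[y] * mapping[y][u] for y in range(n_y))
             for u in range(n_u)]

    rate = sum(
        p_y[y] * mapping[y][u] * math.log2(mapping[y][u] / q[u])
        for y in range(n_y) for u in range(n_u)
        if p_y[y] > 0 and mapping[y][u] > 0
    )
    distortion = sum(
        p_y[y] * mapping[y][u] * d[y][u]
        for y in range(n_y) for u in range(n_u)
    )
    return mapping, rate, distortion
```

For a uniform binary $Y$ with Hamming distortion and `beta = math.log(3)`, the iteration returns a binary symmetric mapping with crossover probability 0.25, matching the known rate-distortion function $1 - H_b(\Delta)$ at $\Delta = 0.25$.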

C. Maximum information leakage 

The maximum cost gain in (4) of an adversary that uses
the log-loss function is given by

$$\Delta C^* = \max_{u \in \mathcal{U}} H(S) - H(S|U = u).$$

The previous expression motivates the definition of maximum
information leakage, presented below.

Definition 4. The maximum information leakage of a
set of features $S$ is defined as the maximum cost gain,
given in terms of the log-loss function, that an adversary
obtains by observing a single output, and is given by
$\max_{u \in \mathcal{U}} H(S) - H(S|U = u)$. A privacy-preserving
mapping $p_{U|Y}(\cdot)$ is said to achieve the minmax information
leakage for a distortion constraint $\Delta$ if it is a
solution of the minimization

$$\min_{p_{U|Y}} \max_{u \in \mathcal{U}} \ H(S) - H(S|U = u) \tag{27}$$
$$\text{s.t. } E[d(U,Y)] \le \Delta. \tag{28}$$

The following theorem demonstrates how the mapping
that achieves the minmax information leakage can be 
determined as the solution of a related convex program 
that finds the minimum distortion given a constraint on 
the maximum information leakage. 

Theorem 2. Given $p_{S,Y}(\cdot,\cdot)$, a distortion function $d(\cdot,\cdot)$
and a constraint $\epsilon$ on the maximum information leakage,
the minimum achievable distortion and the mapping that
achieves the minmax information leakage can be found
by solving the following convex optimization (assuming
the implicit simplex constraints on the probability distributions):

$$\min_{p_{U|Y},\, p_{U|S}} \ \sum_{u \in \mathcal{U}} \sum_{y \in \mathcal{Y}} p_{U|Y}(u|y)\, p_Y(y)\, d(u,y) \tag{29}$$
$$\text{s.t. } \sum_{y \in \mathcal{Y}} p_{Y|S}(y|s)\, p_{U|Y}(u|y) = p_{U|S}(u|s) \quad \forall u, s, \tag{30}$$
$$\sum_{s \in \mathcal{S}} p_{U|S}(u|s)\, p_S(s) = p_U(u) \quad \forall u, \tag{31}$$
$$\delta\, p_U(u) + \sum_{s \in \mathcal{S}} p_{U|S}(u|s)\, p_S(s) \log \frac{p_{U|S}(u|s)\, p_S(s)}{p_U(u)} \le 0 \quad \forall u, \tag{32}$$

where $\delta = H(S) - \epsilon$. Therefore, for a given value of
$\Delta$, the optimization problem in (27) can be efficiently
solved with arbitrarily large precision by performing a
line-search over $\epsilon \in [0, H(S)]$ and solving the previous
convex program at each step of the search.

Proof: The optimization in (27) can be reformulated
to return the minimum distortion for a given
constraint $\epsilon$ on the minmax information leakage as

$$\min_{p_{U|Y}} \ E[d(U,Y)] \tag{33}$$
$$\text{s.t. } H(S|U = u) \ge \delta \quad \forall u \in \mathcal{U}. \tag{34}$$

It is straightforward to verify that constraint (32) can be
written as (34). Following the same steps as the proof
of Theorem 1 and noting that the function $g_2(x,z,a) = a x \log(a x / z)$
is convex for $a, x \ge 0$, $z > 0$, it follows
that (34) and, consequently, (32) is a convex constraint.
Finally, since the optimal distortion value in the previous
program is a decreasing function of $\epsilon$, it follows that the
solution of (27) can be found through a line-search in $\epsilon$.

■
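The line-search over $\epsilon$ can be sketched as a bisection. In the Python snippet below, `min_distortion` is a hypothetical oracle that stands for solving the convex program (29)-(32) at a given $\epsilon$ and returning the optimal distortion; it is assumed to be non-increasing in $\epsilon$, as stated in the proof. The function name, the tolerance and the return convention are illustrative assumptions.

```python
def line_search_epsilon(min_distortion, target, h_s, tol=1e-6):
    """Smallest eps in [0, h_s] with min_distortion(eps) <= target.

    min_distortion  hypothetical oracle: optimal value of (29)-(32) at eps,
                    assumed non-increasing in eps
    target          distortion constraint Delta
    h_s             entropy H(S), the upper end of the search interval
    Returns None if even eps = H(S) cannot meet the distortion target.
    """
    if min_distortion(h_s) > target:
        return None
    lo, hi = 0.0, h_s
    if min_distortion(lo) <= target:
        return lo
    # Invariant: min_distortion(lo) > target >= min_distortion(hi).
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if min_distortion(mid) <= target:
            hi = mid
        else:
            lo = mid
    return hi
```

Each oracle call is one convex program, so the total cost is roughly $\log_2(H(S)/\text{tol})$ solves. The returned value approximates the minmax information leakage achievable under the distortion constraint.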

Remark 4. Analogously to the average information
leakage case, the convex program presented in Theorem
2 can be extended to the setting where the privacy
preserving mapping is given by $p_{U|S}(\cdot|\cdot)$ directly. This
can be done by substituting (30) by (18) and adding the
linear constraint (19).



Even though the convex program presented in Theorem 2
holds in general, it does not provide much insight
on the structure of the privacy mapping that minimizes
the maximum information leakage for a given distortion
constraint. In order to shed light on the nature of the
optimal solution, we present the following result for the
particular case when $Y$ is a deterministic function of $S$
and $S \to Y \to U$.

Corollary 2. For $Y = f(S)$, where $f : \mathcal{S} \to \mathcal{Y}$ is a
deterministic function, $S \to Y \to U$ and a fixed prior
$p_{Y,S}(\cdot,\cdot)$, the privacy preserving mapping that minimizes
the maximum information leakage is given by

$$p_{U|Y}^* = \arg\min_{p_{U|Y}} \max_{u \in \mathcal{U}} D(p_{Y|U} \| \zeta) \tag{35}$$
$$\text{s.t. } E[d(U,Y)] \le \Delta,$$

where $\zeta(y) = \dfrac{2^{H(S|Y=y)}}{\sum_{y' \in \mathcal{Y}} 2^{H(S|Y=y')}}$.

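Proof: Under the assumptions of the corollary, note
that for a given $u \in \mathcal{U}$ (and assuming that the logarithms
are in base 2)

$$\begin{aligned}
H(S|U = u) &= -\sum_{s \in \mathcal{S}} p_{S|U}(s|u) \log p_{S|U}(s|u) \\
&= -\sum_{s \in \mathcal{S}} \Big( \sum_{y \in \mathcal{Y}} p_{S|Y}(s|y)\, p_{Y|U}(y|u) \Big) \log \Big( \sum_{y' \in \mathcal{Y}} p_{S|Y}(s|y')\, p_{Y|U}(y'|u) \Big) \\
&= -\sum_{s \in \mathcal{S}} p_{S|Y}(s|f(s))\, p_{Y|U}(f(s)|u) \log \big[ p_{S|Y}(s|f(s))\, p_{Y|U}(f(s)|u) \big] && (36) \\
&= -\sum_{s \in \mathcal{S},\, y \in \mathcal{Y}} p_{S|Y}(s|y)\, p_{Y|U}(y|u) \log \big[ p_{S|Y}(s|y)\, p_{Y|U}(y|u) \big] && (37) \\
&= H(Y|U = u) + \sum_{y \in \mathcal{Y}} p_{Y|U}(y|u)\, H(S|Y = y) && (38) \\
&= \sum_{y \in \mathcal{Y}} p_{Y|U}(y|u) \log \frac{2^{H(S|Y=y)}}{p_{Y|U}(y|u)} && (39) \\
&= -D(p_{Y|U} \| \zeta) + \log \Big( \sum_{y \in \mathcal{Y}} 2^{H(S|Y=y)} \Big), && (40)
\end{aligned}$$

where (36) and (37) follow by noting that $p_{S|Y}(s|y) = 0$
if $y \ne f(s)$. The result follows directly by substituting
(40) in (27). ■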
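The identity between the first and last lines of the derivation can be checked numerically. The Python sketch below is a sanity check on a toy instance, not part of the proof: it computes $H(S|U=u)$ directly from the induced posterior and compares it with $-D(p_{Y|U}\|\zeta) + \log \sum_y 2^{H(S|Y=y)}$. The dictionary-based layout, the function name and the assumption of a strictly positive prior are illustrative.

```python
import math


def check_identity(p_s, f, p_y_given_u):
    """Return (lhs, rhs) of H(S|U=u) = -D(p_{Y|U} || zeta) + log2(sum_y 2^H(S|Y=y)).

    p_s          dict s -> prior probability (assumed strictly positive)
    f            dict s -> y, the deterministic function Y = f(S)
    p_y_given_u  dict y -> p_{Y|U}(y|u) for one fixed output u
    """
    ys = sorted(set(f.values()))
    p_y = {y: sum(p for s, p in p_s.items() if f[s] == y) for y in ys}

    # H(S | Y = y) for every y in the image of f.
    h = {}
    for y in ys:
        cond = [p / p_y[y] for s, p in p_s.items() if f[s] == y]
        h[y] = -sum(c * math.log2(c) for c in cond if c > 0)

    norm = sum(2.0 ** h[y] for y in ys)
    zeta = {y: 2.0 ** h[y] / norm for y in ys}

    # Left-hand side: posterior of S given U = u through the chain S -> Y -> U.
    lhs = 0.0
    for s, p in p_s.items():
        post = p / p_y[f[s]] * p_y_given_u[f[s]]
        if post > 0:
            lhs -= post * math.log2(post)

    # Right-hand side: negative KL-divergence to zeta plus the log-normalizer.
    kl = sum(q * math.log2(q / zeta[y])
             for y, q in p_y_given_u.items() if q > 0)
    rhs = -kl + math.log2(norm)
    return lhs, rhs
```

Since the KL-divergence is non-negative, the right-hand side also shows that $H(S|U=u)$ never exceeds $\log \sum_y 2^{H(S|Y=y)}$, with equality exactly when the posterior of $Y$ equals $\zeta$.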
For $Y$ a deterministic function of $S$, the optimal privacy
preserving mechanism is the one whose posterior distribution
of $Y$ given $U$ is closest (in terms of KL-divergence)
to $\zeta(\cdot)$. Note that the distribution $\zeta(\cdot)$ captures
the inherent uncertainty that exists in the function $f$ for
different outputs $y \in \mathcal{Y}$. The purpose of the privacy
preserving mapping is then to augment this uncertainty,
while still satisfying the distortion constraint. In particular,
the larger the uncertainty $H(S|Y = y)$, the larger
the probability $p_{Y|U}(y|u)$ for all $u$. Consequently, the
optimal privacy mapping (exponentially) reinforces the
posterior probability of the values of $y$ for which there
is a large uncertainty regarding the features $S$. This fact
is illustrated in the next example, where we revisit the
counting query presented in Example 1.

Example 2 (Counting query continued). Assume that
each database input $S_i$, $1 \le i \le n$, satisfies $\Pr(1_A(S_i) = 1) = p$
and that the inputs are independent and identically distributed.
Then $Y$ is a binomial random variable with parameters
$(n,p)$. It follows that $H(S|Y = y) = \log \binom{n}{y}$. Consequently,
the optimal privacy preserving mapping will be
the one that results in a posterior probability $p_{Y|U}(y|u)$
that is proportional to the size of the pre-image of $y$, i.e.

$$p_{Y|U}(y|u) \propto |f^{-1}(y)| = \binom{n}{y}.$$
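For the counting query the target distribution $\zeta$ has a closed form, since $2^{H(S|Y=y)} = \binom{n}{y}$ and the binomial coefficients sum to $2^n$. The Python sketch below tabulates it, assuming (as in the example) that each entry is identified with its binary indicator; the function names are illustrative.

```python
import math


def counting_query_zeta(n):
    """zeta(y) = C(n, y) / 2^n for y = 0..n (entries treated as binary indicators)."""
    total = 2 ** n
    return [math.comb(n, y) / total for y in range(n + 1)]


def binomial_pmf(n, p):
    """Prior of Y for i.i.d. inputs with Pr(1_A(S_i) = 1) = p."""
    return [math.comb(n, y) * p ** y * (1 - p) ** (n - y) for y in range(n + 1)]
```

One observation that follows directly from these formulas: for $p = 1/2$ the prior of $Y$ already coincides with $\zeta$, whereas for other values of $p$ the target posterior places more mass near $y = n/2$ than the prior does on outputs far from it.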

V. Comparison of privacy metrics 

We now compare average information leakage and 
maximum information leakage with differential privacy 
and information privacy, the latter being a new metric 
introduced in this section. We first recall the definition of 
differential privacy, presenting it in terms of the model 
discussed in Section II and assuming that the set of
features $S$ is a vector given by $S = (S_1, \dots, S_n)$, where
$S_i \in \mathcal{S}$.

Definition 5 ([3]). A privacy preserving mapping
$p_{U|S}(\cdot|\cdot)$ provides $\epsilon$-differential privacy if for all inputs
$s_1$ and $s_2$ differing in at most one entry and all $B \subseteq \mathcal{U}$,

$$\Pr(U \in B \mid S = s_1) \le \exp(\epsilon) \times \Pr(U \in B \mid S = s_2). \tag{41}$$

An alternative (and much stronger) definition of privacy,
related to the one presented in [6], is given below.
We note that this definition is unwieldy, but explicitly 
captures the ultimate goal in privacy: the posterior and 
prior probabilities of the features S do not change 
significantly given the output. 

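Definition 6. A privacy preserving mapping $p_{U|S}(\cdot|\cdot)$
provides $\epsilon$-information privacy if for all $s \in \mathcal{S}^n$:

$$\exp(-\epsilon) \le \frac{p_{S|U}(s|u)}{p_S(s)} \le \exp(\epsilon) \quad \forall u \in \mathcal{U} : p_U(u) > 0. \tag{42}$$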



Note that $\epsilon$-information privacy implies directly $2\epsilon$-differential
privacy and maximum information leakage
of at most $\epsilon/\ln 2$ bits, as shown below.

Theorem 3. If a privacy preserving mapping $p_{U|S}(\cdot|\cdot)$
is $\epsilon$-information private for some input distribution such
that $\mathrm{supp}(p_U) = \mathcal{U}$, then it is at least $2\epsilon$-differentially
private and leaks at most $\epsilon/\ln 2$ bits on average.

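Proof: Note that for a given $B \subseteq \mathcal{U}$

$$\frac{\Pr(U \in B \mid S = s_1)}{\Pr(U \in B \mid S = s_2)} = \frac{\Pr(S = s_1 \mid U \in B)\,\Pr(S = s_2)}{\Pr(S = s_2 \mid U \in B)\,\Pr(S = s_1)} \le \exp(2\epsilon),$$

where the last step follows from (42). Clearly if $s_1$ and
$s_2$ are neighboring vectors (i.e. differ by only one entry),
then $2\epsilon$-differential privacy is satisfied. Furthermore

$$H(S) - H(S|U) = \sum_{s,u} p_{S|U}(s|u)\, p_U(u) \log \frac{p_{S|U}(s|u)}{p_S(s)} \le \sum_{s,u} p_{S|U}(s|u)\, p_U(u)\, \frac{\epsilon}{\ln 2} = \frac{\epsilon}{\ln 2}.$$

■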
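Theorem 3 can be checked on a small example. The Python sketch below computes the smallest $\epsilon$ for which a discrete mapping is information private, the smallest $\epsilon$ for which it is differentially private, and its average leakage in bits. It assumes a single-entry database ($n = 1$), so that every pair of distinct inputs is neighboring, and a strictly positive prior; the function names and the layout `p_u_given_s[s][u]` are illustrative.

```python
import math


def information_privacy_eps(p_s, p_u_given_s):
    """Smallest eps with exp(-eps) <= p_{S|U}(s|u) / p_S(s) <= exp(eps).

    Uses p_{S|U}(s|u) / p_S(s) = p_{U|S}(u|s) / p_U(u), valid when p_S(s) > 0.
    """
    n_s, n_u = len(p_s), len(p_u_given_s[0])
    p_u = [sum(p_s[s] * p_u_given_s[s][u] for s in range(n_s)) for u in range(n_u)]
    eps = 0.0
    for u in range(n_u):
        if p_u[u] == 0:
            continue
        for s in range(n_s):
            ratio = p_u_given_s[s][u] / p_u[u]
            if ratio == 0:
                return math.inf
            eps = max(eps, abs(math.log(ratio)))
    return eps


def differential_privacy_eps(p_u_given_s):
    """Smallest eps for differential privacy, assuming all distinct inputs are neighbors."""
    eps = 0.0
    for u in range(len(p_u_given_s[0])):
        col = [row[u] for row in p_u_given_s]
        if max(col) == 0:
            continue
        if min(col) == 0:
            return math.inf
        eps = max(eps, math.log(max(col) / min(col)))
    return eps


def mutual_information_bits(p_s, p_u_given_s):
    """Average information leakage I(S;U) in bits."""
    n_s, n_u = len(p_s), len(p_u_given_s[0])
    p_u = [sum(p_s[s] * p_u_given_s[s][u] for s in range(n_s)) for u in range(n_u)]
    return sum(
        p_s[s] * p_u_given_s[s][u] * math.log2(p_u_given_s[s][u] / p_u[u])
        for s in range(n_s) for u in range(n_u)
        if p_s[s] > 0 and p_u_given_s[s][u] > 0
    )
```

For a binary input with a uniform prior that is reported truthfully with probability 0.75, the mapping is $(\ln 2)$-information private and $(\ln 3)$-differentially private, consistent with the $2\epsilon$ bound, and its average leakage is below $\epsilon/\ln 2 = 1$ bit.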

We show in the next theorem that differential privacy
does not guarantee privacy in terms of average information
leakage in general and, consequently, in terms of
maximum information leakage and information privacy.
More specifically, guaranteeing that a mechanism is
$\epsilon$-differentially private does not provide any guarantee on
the information leakage.

Theorem 4. For every $\epsilon > 0$ and $\delta > 0$, there exists an
$n \in \mathbb{Z}^+$, sets $\mathcal{S}^n$ and $\mathcal{U}$, a prior $p_S(\cdot)$ over $\mathcal{S}^n$ and a
privacy mapping $p_{U|S}(\cdot|\cdot)$ that is $\epsilon$-differentially private
but leaks at least $\delta$ bits on average.

Proof: We prove the statement by explicitly constructing
an example that is $\epsilon$-differentially private, but from which
an arbitrarily large amount of information can leak on
average. For this, we return to the
counting query discussed in Examples 1 and 2, with
the sets $\mathcal{S}$ and $\mathcal{Y}$ being defined accordingly, and letting
$\mathcal{U} = \mathcal{Y}$. We do not assume independence of the inputs.

For the counting query and for any given prior, adding
Laplacian noise to the output provides $\epsilon$-differential
privacy [3]. More precisely, for the output of the query
given in (7), denoted as $Y \sim p_Y(y)$, $0 \le y \le n$, the
mapping

$$U = Y + N, \quad N \sim \mathrm{Lap}(1/\epsilon), \tag{43}$$

where the pdf of the additive noise $N$ is given by

$$p_N(r;\epsilon) = \frac{\epsilon}{2} \exp(-|r|\epsilon), \tag{44}$$

is $\epsilon$-differentially private. Now assume that $\epsilon$ is given,
and denote $S = (X_1, \dots, X_n)$. Set $k$ and $n$ such that $n \bmod k = 0$,
and let $p_S(\cdot)$ be such that

$$p_Y(y) = \begin{cases} \left(1 + \frac{n}{k}\right)^{-1} & \text{if } y \bmod k = 0, \\ 0 & \text{otherwise.} \end{cases} \tag{45}$$

With the goal of lower-bounding the information leakage,
assume that Bob, after observing $U$, maps it to the
nearest value of $y$ such that $p_Y(y) > 0$, i.e. does a
maximum a posteriori estimation of $Y$. The probability
that Bob makes a correct estimation (neglecting edge
effects), denoted by $\alpha_{k,n}(\epsilon)$, is given by:

$$\alpha_{k,n}(\epsilon) = \int_{-k/2}^{k/2} \frac{\epsilon}{2} \exp(-|x|\epsilon)\, dx = 1 - \exp\left(-\frac{k\epsilon}{2}\right). \tag{46}$$

Let $E$ be a binary random variable that indicates the
event that Bob makes a wrong estimation of $Y$ given
$U$. Then

$$\begin{aligned}
I(Y;U) &\ge I(E,Y;U) - 1 \\
&\ge I(Y;U|E) - 1 \\
&\ge \Pr\{E = 0\}\, I(Y;U|E = 0) - 1 \\
&= \left(1 - e^{-k\epsilon/2}\right) \log\left(1 + \frac{n}{k}\right) - 1,
\end{aligned}$$

which can be made arbitrarily larger than $\delta$ by appropriately
choosing the values of $n$ and $k$. Since $Y$ is
a deterministic function of $S$, $I(Y;U) = I(S;U)$, as
shown in the proof of Corollary 1, and the result follows.

■
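To get a feel for the magnitude of the bound, the Python sketch below evaluates $\alpha_{k,n}(\epsilon)$ and the resulting lower bound on the leakage for given $n$, $k$ and $\epsilon$. The function names are illustrative, and the base-2 logarithm is an assumed convention so that the result is expressed in bits.

```python
import math


def correct_guess_probability(k, eps):
    """alpha_{k,n}(eps) = 1 - exp(-k * eps / 2), neglecting edge effects."""
    return 1.0 - math.exp(-k * eps / 2.0)


def leakage_lower_bound_bits(n, k, eps):
    """(1 - exp(-k * eps / 2)) * log2(1 + n / k) - 1, the bound derived above."""
    return correct_guess_probability(k, eps) * math.log2(1.0 + n / k) - 1.0
```

For example, with $\epsilon = 0.1$, $k = 100$ and $n = 10^6$ the bound is roughly 12 bits, and it keeps growing as $n$ increases with $k$ and $\epsilon$ fixed, even though the mechanism remains $\epsilon$-differentially private throughout.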

The counterexample used in the proof of the previous 
theorem can be extended to allow the adversary to 
recover exactly the inputs that generated the output $U$. This
can be done by assuming that the inputs are ordered
and correlated in such a way that $Y = y$ if and only if
$S_1 = 1, \dots, S_y = 1$. In this case, for $n$ and $k$ sufficiently
large, the adversary can exploit the input correlation to
correctly learn the values of $S_1, \dots, S_n$ with arbitrarily
high probability. 

Differential privacy does not necessarily guarantee 
low leakage of information - in fact, an arbitrarily 
large amount of information can be leaking from a 
differentially private system, as shown in Theorem 4.
This is a serious issue when using solely the differential
privacy definition as a privacy metric. In addition, it
follows as a simple extension of [11, Prop. 4.3] that
$I(S;U) \le O(\epsilon n)$, corroborating that differential privacy
does not bound above the average information leakage 
when n is sufficiently large. 



Nevertheless, differential privacy does have an oper- 
ational advantage since it does not require any prior 
information. However, by neglecting the prior and re- 
quiring differential privacy, the resulting mapping might 
not be de facto private, being suboptimal under the 
information leakage measure. We note that the presented 
formulations can be made prior independent by maximizing
the minimum information leakage over a set of possible
priors. This problem is closely related to universal coding
[10].

VI. Conclusions

In this paper we presented a general statistical infer- 
ence framework to capture the privacy threat incurred 
by a user that releases data to a passive but curious 
adversary given utility constraints. We demonstrated how 
under certain assumptions this framework naturally leads 
to an information-theoretic approach to privacy. The 
design problem of finding privacy-preserving mappings 
for minimizing the information leakage from a user's 
data with utility constraints was formulated as a con- 
vex program. This approach can lead to practical and 
deployable privacy-preserving mechanisms. Finally, we 
compared our approach with differential privacy, and 
showed that the differential privacy requirement does not 
necessarily constrain the information leakage from a data 
set. 